82 research outputs found

    Joint Video and Text Parsing for Understanding Events and Answering Queries

    We propose a framework for parsing video and text jointly for understanding events and answering user queries. Our framework produces a parse graph that represents the compositional structures of spatial information (objects and scenes), temporal information (actions and events) and causal information (causalities between events and fluents) in the video and text. The knowledge representation of our framework is based on a spatial-temporal-causal And-Or graph (S/T/C-AOG), which jointly models possible hierarchical compositions of objects, scenes and events as well as their interactions and mutual contexts, and specifies the prior probability distribution of the parse graphs. We present a probabilistic generative model for joint parsing that captures the relations between the input video/text, their corresponding parse graphs and the joint parse graph. Based on the probabilistic model, we propose a joint parsing system consisting of three modules: video parsing, text parsing and joint inference. Video parsing and text parsing produce two parse graphs from the input video and text, respectively. The joint inference module produces a joint parse graph by performing matching, deduction and revision on the video and text parse graphs. The proposed framework has the following objectives: first, we aim at deep semantic parsing of video and text that goes beyond traditional bag-of-words approaches; second, we perform parsing and reasoning across the spatial, temporal and causal dimensions based on the joint S/T/C-AOG representation; third, we show that deep joint parsing facilitates subsequent applications such as generating narrative text descriptions and answering queries in the form of who, what, when, where and why. We empirically evaluated our system by comparison against ground truth as well as the accuracy of query answering, and obtained satisfactory results.
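The three-module pipeline described above can be sketched as follows. This is an illustrative simplification, not the authors' implementation: the dictionary-based parse-graph representation and all function names here are hypothetical stand-ins.

```python
# Hypothetical sketch of the three-module pipeline: video parsing and text
# parsing each yield a parse graph; joint inference merges them by matching
# shared entities, deducing nodes present in only one graph, and revising
# conflicting attributes in favor of the higher-confidence source.

def parse_video(video):
    # Stand-in for the video parsing module: returns a parse graph as
    # {entity: {attribute: (value, confidence)}}.
    return {"person": {"action": ("open_door", 0.7)}}

def parse_text(text):
    # Stand-in for the text parsing module.
    return {"person": {"action": ("enter_room", 0.9)},
            "room": {"type": ("office", 0.8)}}

def joint_inference(g_video, g_text):
    joint = {}
    for entity in set(g_video) | set(g_text):        # matching entities
        attrs = {}
        for g in (g_video, g_text):                  # deduction: union of nodes
            for attr, (val, conf) in g.get(entity, {}).items():
                # revision: on conflict, keep the higher-confidence value
                if attr not in attrs or conf > attrs[attr][1]:
                    attrs[attr] = (val, conf)
        joint[entity] = attrs
    return joint

joint_graph = joint_inference(parse_video(None), parse_text(None))
```

In this toy run the text parse's higher-confidence "enter_room" revises the video parse's "open_door", and the "room" entity, seen only in the text, is carried into the joint graph.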

    Gen-LaneNet: A Generalized and Scalable Approach for 3D Lane Detection

    We present a generalized and scalable method, called Gen-LaneNet, to detect 3D lanes from a single image. The method, inspired by the latest state-of-the-art 3D-LaneNet, is a unified framework solving image encoding, spatial transform of features and 3D lane prediction in a single network. However, we propose unique designs for Gen-LaneNet in two respects. First, we introduce a new geometry-guided lane anchor representation in a new coordinate frame and apply a specific geometric transformation to directly calculate real 3D lane points from the network output. We demonstrate that aligning the lane points with the underlying top-view features in the new coordinate frame is critical for a generalized method to handle unfamiliar scenes. Second, we present a scalable two-stage framework that decouples the learning of the image segmentation subnetwork and the geometry encoding subnetwork. Compared to 3D-LaneNet, the proposed Gen-LaneNet drastically reduces the amount of 3D lane labels required to achieve a robust solution in real-world applications. Moreover, we release a new synthetic dataset and its construction strategy to encourage the development and evaluation of 3D lane detection methods. In experiments, we conduct extensive ablation studies to substantiate that the proposed Gen-LaneNet significantly outperforms 3D-LaneNet in average precision (AP) and F-score.
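The geometric transformation mentioned above can be sketched in a simplified form. Assuming (as in flat-ground inverse-perspective mapping) that the network predicts lateral offsets and heights at fixed longitudinal positions in a virtual top-view frame, a point at height z above the ground projects onto the flat-ground plane scaled by h / (h - z) for a camera mounted at height h, so the real 3D point is recovered by the inverse scaling. The function below is an illustrative sketch under that assumption, not the paper's exact formulation.

```python
# Sketch: recover a real 3D lane point (x, y, z) from a flat-ground
# top-view prediction (x_bar, y_bar) and a predicted height z, given the
# camera mounting height above the ground plane.
def anchor_to_3d(x_bar, y_bar, z, cam_height):
    # Similar triangles: the flat-ground projection is stretched by
    # cam_height / (cam_height - z), so invert that scaling.
    scale = (cam_height - z) / cam_height
    return (x_bar * scale, y_bar * scale, z)

# A point predicted 20 m ahead and 1.5 m to the side in the virtual top
# view, on a 0.5 m crest, seen from a camera 1.0 m above the ground plane:
pt = anchor_to_3d(1.5, 20.0, 0.5, cam_height=1.0)
```

For z = 0 (truly flat ground) the transformation is the identity, which is why a flat-ground anchor frame stays consistent with the top-view features.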

    Isolated pulmonary cryptococcosis in an immunocompetent boy

    Pulmonary cryptococcosis is rare in immunocompetent subjects. Here, we present the case of a 16-year-old boy who was referred to our pediatric department for the management of multiple consolidations detected on chest radiography, which was routinely performed while the patient was being evaluated for an ankle fracture. Fine needle aspiration biopsy was performed, and the definitive diagnosis was established as cryptococcal pneumonia. After 8 weeks of antifungal treatment, the pulmonary nodules on the chest radiographs disappeared.

    Takayasu's Arteritis Treated by Percutaneous Transluminal Angioplasty with Stenting in the Descending Aorta

    A 17-yr-old woman was referred to our hospital with a 2-yr history of claudication of the lower extremities and severe arterial hypertension. Physical examination revealed significantly different blood pressures between both arms (160/92 and 180/95 mmHg) and legs (92/61 and 82/57 mmHg). The hematological and biochemical values were within their normal ranges, except for the increased erythrocyte sedimentation rate (83 mm/hr) and C-reactive protein (6.19 mg/L). On 3-dimensional computed tomographic angiography, the ascending aorta, the aortic arch and its branches, and the descending thoracic aorta, but not the renal artery, were shown to be stenotic. The diagnosis of type IIb Takayasu's arteritis was made according to the new angiographic classification of Takayasu's arteritis, Takayasu conference 1994. Percutaneous transluminal angioplasty with stenting was performed on the thoracic and abdominal aorta. After the interventional procedures, the upper extremity blood pressure improved from 162/101 mmHg to 132/85 mmHg. She has been free of claudication and there have been no cardiac events during 2 yr of clinical follow-up.

    Clinical features and outcomes of gastric variceal bleeding: retrospective Korean multicenter data

    Background/Aims: While gastric variceal bleeding (GVB) is not as prevalent as esophageal variceal bleeding, it is reportedly more serious, with high failure rates of the initial hemostasis (>30%), and has a worse prognosis than esophageal variceal bleeding. However, there is limited information regarding hemostasis and the prognosis for GVB. The aim of this study was to determine retrospectively the clinical outcomes of GVB in a multicenter study in Korea. Methods: The data of 1,308 episodes of GVB (males:females=1062:246, age=55.0±11.0 years, mean±SD) were collected from 24 referral hospital centers in South Korea between March 2003 and December 2008. The rates of initial hemostasis failure, rebleeding, and mortality within 5 days and 6 weeks of the index bleed were evaluated. Results: The initial hemostasis failed in 6.1% of the patients, and this was associated with the Child-Pugh score [odds ratio (OR)=1.619; P<0.001] and the treatment modality: endoscopic variceal ligation, endoscopic variceal obturation, and balloon-occluded retrograde transvenous obliteration vs. endoscopic sclerotherapy, transjugular intrahepatic portosystemic shunt, and balloon tamponade (OR=0.221, P<0.001). Rebleeding developed in 11.5% of the patients, and was significantly associated with Child-Pugh score (OR=1.159, P<0.001) and treatment modality (OR=0.619, P=0.026). The GVB-associated mortality was 10.3%; mortality in these cases was associated with Child-Pugh score (OR=1.795, P<0.001) and the treatment modality for the initial hemostasis (OR=0.467, P=0.001). Conclusions: The clinical outcome for GVB was better for the present cohort than in previous reports. Initial hemostasis failure, rebleeding, and mortality due to GVB were universally associated with the severity of liver cirrhosis.

    Registration of Multimodal Fluorescein Images Sequence of the Retina

    In this study we present a Y-feature extraction method for registering color and fluorescein angiograms of the retina. The registration of multimodal fluorescein imagery requires the identification of strong geometric features in the retinal images that are invariant across modalities and to temporal grey-level variations due to the propagation of the dye in the vessels. The most informative features, invariant across the considered modalities, are the locations of vessel ramifications: the so-called Y-features. We propose a Y-feature extraction method based on the local classification of image gradient information and an articulated model. An appropriate cost function is proposed for fitting the model using a gradient-based approach. The fitted Y-features are subsequently matched across the images for registering the color and fluorescein images. Experimental results obtained on a large database validate the proposed method.
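Gradient-based fitting of an articulated model to a cost function can be sketched as follows. The parameterization (a center plus three branch angles) and the squared-distance cost are hypothetical simplifications for illustration; the paper's actual model and cost function are not reproduced here.

```python
# Hypothetical sketch: a Y-feature as a center (cx, cy) with three branch
# angles; fitting minimizes the squared distance between branch endpoints
# and observed vessel points by numerical-gradient descent.
import math

def endpoints(params, length=1.0):
    cx, cy, a1, a2, a3 = params
    return [(cx + length * math.cos(a), cy + length * math.sin(a))
            for a in (a1, a2, a3)]

def cost(params, observed):
    # Sum of squared endpoint-to-observation distances.
    return sum((ex - ox) ** 2 + (ey - oy) ** 2
               for (ex, ey), (ox, oy) in zip(endpoints(params), observed))

def fit(params, observed, lr=0.05, steps=500, eps=1e-6):
    params = list(params)
    for _ in range(steps):
        base = cost(params, observed)
        grad = []
        for i in range(len(params)):       # forward-difference gradient
            p = params[:]
            p[i] += eps
            grad.append((cost(p, observed) - base) / eps)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params

# Three observed branch tips roughly 120 degrees apart around the origin:
s3 = math.sqrt(3) / 2
observed = [(1.0, 0.0), (-0.5, s3), (-0.5, -s3)]
fitted = fit([0.2, 0.2, 0.0, 2.0, -2.0], observed)
```

In practice a second-order or analytic-gradient optimizer would replace the numerical gradient, but the structure (parameterized articulated model, cost, iterative descent) is the same.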

    MV-Algebras of Continuous Functions and l-Monoids

    Abstract. A. Di Nola & S. Sessa [8] showed that two compact spaces X and Y are homeomorphic iff the MV-algebras C(X, I) and C(Y, I) of continuous functions defined on X and Y, respectively, are isomorphic. They also proved that A is a semisimple MV-algebra iff A is a subalgebra of C(X) for some compact Hausdorff space X. In this paper, we first prove these characterization theorems by a functorial argument. Furthermore, we obtain some other functorial results relating topological spaces and MV-algebras. Second, as a classical problem, we find a necessary and sufficient condition on a given residuated l-monoid for it to be segmentally embedded into an l-group with order unit.
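The two characterization theorems can be restated compactly in symbols (notation inferred from the abstract, with I = [0, 1]):

```latex
% Di Nola--Sessa characterizations, restated from the abstract (I = [0,1]):
\[
  C(X, I) \cong C(Y, I) \ \text{as MV-algebras}
  \quad\Longleftrightarrow\quad
  X \ \text{and}\ Y \ \text{are homeomorphic}
  \qquad (X, Y \ \text{compact}),
\]
\[
  A \ \text{is a semisimple MV-algebra}
  \quad\Longleftrightarrow\quad
  A \ \text{is a subalgebra of}\ C(X)
  \ \text{for some compact Hausdorff space}\ X.
\]
```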
